FUW TRENDS IN SCIENCE & TECHNOLOGY JOURNAL

(A Peer Review Journal)
e–ISSN: 2408–5162; p–ISSN: 2048–5170


MODELING OUTLIERS IN GAUSSIAN AND NON-GAUSSIAN DISTRIBUTIONS: THE WAVELET APPROACH
Pages: 508-512
Efuwape B. T.1*, Aideyan D. O.2, Abdullah K-K. A.3 and Efuwape T. O.4


Keywords: Modeling; Resolution; Wavelet Analysis; Akaike Information Criteria; Data

Abstract

Aberrant observations (AOs) are observations that deviate significantly from the majority of the data. They may be generated by a mechanism different from the one that produces normal data, and may arise from sensor noise, process disturbances, instrument degradation or human-related errors. Unless they are modeled appropriately, decisions on suspected aberrant observations might be inappropriate. In this paper, we present an approach to modeling aberrant observations based on wavelet analysis under the Gaussian Normal Distribution (ND) and two non-Gaussian distributions, the Contaminated Normal Distribution (CND) and the Laplace Distribution (LD). To characterize these distributions, three series of 508, 1020 and 2040 observations were simulated from the normal distribution and contaminated with four, four and eight aberrant observations respectively, giving series lengths of 512, 1024 and 2048. Two real datasets of 128 observations each were also analyzed: the University College Hospital Ibadan Diabetic Data (UCHDD) and the Zadakat Data (ZD) from a local mosque in Ibadan. These sizes were used because wavelet analysis is dyadic, that is, it works on series whose lengths are powers of two. The Mallat algorithm was used to reduce the data to smaller resolutions while preserving the desired statistics. In the three simulated series, the CND had the highest Akaike Information Criterion (AIC) estimates, followed by ND and then LD; hence LD is the most efficient in modeling data in the presence of aberrant observations. For the real datasets, series A (UCHDD) and series B (ZD), the findings were the same as for the simulated datasets, except that the more observations there were, the lower the LD estimates were in modeling aberrant observations.
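As an illustration only, the workflow summarized in the abstract can be sketched in Python. This is not the authors' implementation. The Haar filter, the outlier magnitude of 10, the insertion positions, the random seed and the function names `haar_step`, `haar_pyramid`, `aic_normal` and `aic_laplace` are all assumptions made for this sketch; the abstract does not name the wavelet family or the fitting procedure. The sketch builds a dyadic series of 508 normal draws plus four aberrant values, reduces it with a Mallat-style pyramid, and computes AIC for maximum-likelihood normal and Laplace fits. The contaminated normal fit is omitted.

```python
import math
import random
import statistics


def haar_step(signal):
    """One pyramid level with the Haar filter (assumed): halve the resolution."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_pyramid(signal, levels):
    """Apply haar_step repeatedly to the approximation coefficients."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def aic_normal(x):
    """AIC = 2k - 2 log L for a normal fit by maximum likelihood (k = 2)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    loglik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return 2 * 2 - 2 * loglik


def aic_laplace(x):
    """AIC = 2k - 2 log L for a Laplace fit by maximum likelihood (k = 2)."""
    n = len(x)
    loc = statistics.median(x)                 # ML estimate of location
    scale = sum(abs(v - loc) for v in x) / n   # ML estimate of scale
    loglik = -n * math.log(2 * scale) - n
    return 2 * 2 - 2 * loglik


# Hypothetical simulated series: 508 normal draws plus 4 aberrant values = 512.
random.seed(0)
series = [random.gauss(0.0, 1.0) for _ in range(508)]
for pos in (100, 200, 300, 400):   # assumed positions
    series.insert(pos, 10.0)       # assumed outlier magnitude

approx, details = haar_pyramid(series, levels=3)
print("lengths:", len(series), "->", len(approx))
print("AIC normal :", aic_normal(series))
print("AIC Laplace:", aic_laplace(series))
```

A lower AIC indicates the preferred model, so the two printed values can be compared directly. Because the Haar transform used here is orthonormal, the sum of squares of the series equals that of the final approximation plus all detail coefficients, which is one sense in which a pyramid reduction can preserve statistics of the data.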
